Phonetic Spelling and Heuristic Search
Abstract
We introduce a new approach to spellchecking for languages with extreme phonetic irregularities. Spelling correction for such languages can be significantly improved if knowledge about pronunciation and sound becomes the central part of the spelling algorithm. However, given a weak phoneme-grapheme correspondence, the standard spelling algorithms, which are rule-based or edit-distance-based, are severely limited in their phonetic capabilities. A production approach to spelling can overcome these limitations but suffers from the size of its search space. In this paper we describe the main building blocks for tackling this problem with heuristic search. Our ideas have been operationalized in the SMARTSPELL algorithm, with impressive results in both spelling correction and runtime.
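To make the idea of a production approach concrete, the following is a minimal sketch, not the SMARTSPELL algorithm itself. It assumes a hypothetical phoneme-to-grapheme table (`P2G`) with made-up costs and a toy dictionary, and it expands spellings in order of accumulated cost (a best-first search with no lookahead estimate), pruning any partial spelling that is not a prefix of a dictionary word. All names and values are illustrative assumptions.

```python
import heapq

# Hypothetical phoneme-to-grapheme table: each phoneme maps to candidate
# spellings with a cost (lower = more common). Values are illustrative only.
P2G = {
    "f": [("f", 0.0), ("ph", 1.0), ("gh", 2.0)],
    "i": [("i", 0.0), ("y", 1.0), ("ee", 1.0)],
    "sh": [("sh", 0.0), ("ti", 2.0), ("ch", 2.0)],
}


def spell_candidates(phonemes, dictionary, limit=3):
    """Produce dictionary words for a phoneme sequence, cheapest first."""
    # Every prefix of every dictionary word; used to prune dead branches.
    prefixes = {w[:i] for w in dictionary for i in range(len(w) + 1)}
    heap = [(0.0, 0, "")]  # (cost so far, phonemes consumed, spelling so far)
    found = []
    while heap and len(found) < limit:
        cost, pos, text = heapq.heappop(heap)
        if pos == len(phonemes):
            if text in dictionary:
                found.append((text, cost))
            continue
        for grapheme, step in P2G.get(phonemes[pos], []):
            candidate = text + grapheme
            if candidate in prefixes:  # skip spellings no word starts with
                heapq.heappush(heap, (cost + step, pos + 1, candidate))
    return found


if __name__ == "__main__":
    print(spell_candidates(["f", "i", "sh"], {"fish", "phish"}))
```

Without the prefix check, the number of partial spellings grows with the product of the alternatives per phoneme, which is the search-space problem the abstract refers to; a real system would also add an estimate of the remaining cost to guide the search.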
Similar resources
Fast Phonetic Similarity Search over Large Repositories
Today there is a large amount of unstructured data produced by information systems from different domains. These sources may be analyzed for different purposes. Existing approaches use string similarity methods to search for valid words within a text, with a supporting dictionary. However, they have two main drawbacks. First, they are not rich enough to encode phonetic information to assist the...
A Double Metaphone Encoding for Approximate Name Searching and Matching in Bangla
Almost any word can be a Bangali name, and the name in turn is often spelled in many different ways, all of which are considered correct and interchangeable. The reason for the spelling complication is two-fold: (1) there is a large gap between the script and pronunciation in Bangla, largely attributed to the large-scale Sanskritization process that started in the 12th century and continued throu...
A New Phonetic Candidate Generator for Improving Search Query Efficiency
Misspelled queries caused by homophones or mispronunciation are difficult to correct with conventional spelling correction methods. In phonetic candidate generation, the generator produces candidates that are phonetically similar to a given query. In this paper, we present a new phonetic candidate generator for improving the search efficiency of a query. The proposed generator consists o...
A Language-Independent, Data-Oriented Architecture for Grapheme-to
We report on an implemented grapheme-to-phoneme conversion architecture. Given a set of examples (spelling words with their associated phonetic representation) in a language, a grapheme-to-phoneme conversion system is automatically produced for that language which takes as its input the spelling of words, and produces as its output the phonetic transcription according to the rules implicit in t...
Develop a Model of Ethical Components of Participation in the Phonetic Behavior of Employees
Background: Employee participation in the organization and hearing their voices is one of the ways to increase and strengthen the spirit of criticism and teamwork with the aim of increasing productivity in organizations. The purpose of this study is to develop a model of ethical components of participation in the phonetic behavior of employees. Method: The research method is heuristic. The sta...